SUMMARY

The LMS algorithm and the learning identification, which are the typical
adaptive algorithms at present, have the problem that the speed of
convergence may decrease greatly depending on the properties of the input
signal. To avoid this problem, this paper presents a geometrical discussion
of the origin of that defect and proposes a new adaptive algorithm based on
the result of the investigation. A comparison of the convergence speeds of
the proposed algorithm and the learning identification by numerical
experiment on a computer verified a great improvement. The algorithm is
extended to a group of algorithms, called APA (affine projection algorithm),
which includes the original algorithm and the learning identification. It is
shown that APA has some desirable properties: the coefficient vector
approaches the true value monotonically, and the convergence speed is
independent of the amplitude of the input signal. Clear conclusions are also
obtained on what noise is included in the output signal when an external
disturbance is impressed or when the order of the adaptive filter is not
sufficient.

1. Introduction

A filter with a kind of learning function, in which the input signal and
the output signal to be produced (desired output) are specified and the
coefficients of the filter are modified successively so that the output
approaches the desired output, is called an adaptive filter. It has been
applied to many problems such as the automatic equalizer, the echo canceller,
and the noise-elimination device.

The most important problem in the adaptive filter is the algorithm of how to
modify the coefficients successively. A number of studies have been made on
the algorithm, the most well-known being the LMS algorithm [1]. This
algorithm is simple, requires little computation time and is easy to
implement in hardware. Adaptive filters using this algorithm are already on
the market. On the other hand, it has the problem that the convergence speed
cannot be made very high, and it is not suited to applications requiring
fast convergence. It has another problem in that the convergence speed
depends greatly on the properties of the input signal.

Another algorithm, similar to the LMS algorithm, is the learning
identification [2]. This algorithm can be regarded as an improvement of the
LMS algorithm. Although the computational complexity increases, it has
desirable features in that the convergence speed is higher and is
independent of the amplitude of the input signal. An attempt has been made
to apply the algorithm to the echo canceller [3]. However, the learning
identification has a problem in common with the LMS algorithm in that the
convergence speed is sometimes degraded depending on the properties of the
input signal.

This paper is an attempt to solve the problem in the typical adaptive
algorithms up to the present, where the convergence speed may be degraded.
By analyzing the algorithms from a geometrical viewpoint, a new adaptive
algorithm is derived. Its effectiveness is verified and its characteristics
are discussed.

Section 2 outlines the learning identification and the LMS algorithm, and
describes their problems. Section 3 derives a new adaptive algorithm based
on those considerations. The effectiveness of the new algorithm is verified
by a numerical experiment on a computer.

In Sect. 4, the algorithm derived in Sect. 3 is examined from the viewpoint
of orthogonal projection onto an affine subspace. Then the algorithm is
extended to a more general algorithm. Several practically important
properties are discussed, including the behavior under external disturbance.

Fig. 1. Linear system identification by an adaptive filter.



2. Past Algorithm 

The algorithm proposed in this paper can be considered as an extension of
the learning identification. The learning identification is outlined first,
and then the LMS algorithm is reviewed within that framework. The problems
of those algorithms are described from a geometrical viewpoint, which leads
to the new algorithm proposed in this paper.

2.1 Formulation of the problem and notations

The adaptive filter can be formulated as a problem of identifying a linear
system. Consider, as in Fig. 1, an unknown system to be identified, which
gives the output $\{y_j\}$ determined by

$$ y_j = \sum_{k=1}^{n} w_k x_{j-k+1} $$

for the input signal $\{x_j\}$. In the above, $w_1, w_2, \ldots, w_n$ are
unknown constants. The signal $\ldots, x_{-1}, x_0, x_1, x_2, \ldots$ is
represented by $\{x_j\}$, and $\ldots, y_{-1}, y_0, y_1, y_2, \ldots$ is
represented by $\{y_j\}$.



Then consider another linear system in which the output $\{z_j\}$ for the
same input signal $\{x_j\}$ is determined by

$$ z_j = \sum_{k=1}^{n} v_k^{(j)} x_{j-k+1} . $$

It is called the adaptive filter. The adaptive algorithm is used to modify
the coefficient vector of the adaptive filter at time $j$,

$$ v_j = (v_1^{(j)}, v_2^{(j)}, \ldots, v_n^{(j)})' , $$

in the form

$$ v_{j+1} = v_j + \Delta v_j $$

successively, to make it approach the coefficient vector of the system to be
identified:

$$ w = (w_1, w_2, \ldots, w_n)' . $$

The vector $\Delta v_j$ in the above expression is restricted to a function
of the input and output values up to time $j$:

$$ \Delta v_j = f(x_j, x_{j-1}, \ldots;\; y_j, y_{j-1}, \ldots) . $$

By the choice of $f$, various kinds of algorithms can be obtained.

The following notations are used in addition to those already used:

(i) $\mathbf{x}_j = (x_j, x_{j-1}, \ldots, x_{j-n+1})'$

(ii) For $a = (a_1, \ldots, a_n)'$ and $b = (b_1, \ldots, b_n)'$,
$\langle a, b \rangle = \sum_{i=1}^{n} a_i b_i$

(iii) $\|a\| = \sqrt{\langle a, a \rangle}$

(iv) $\Pi_j = \{ v : \langle v, \mathbf{x}_j \rangle = y_j \}$

The set $\Pi_j$ defined in (iv) is the set of all coefficient vectors which
give the output equal to $y_j$ for the input vector $\mathbf{x}_j$, and forms
a hyperplane in the $n$-dimensional Euclidean space.

Fig. 2. Geometrical illustration of the learning method ($\mu = 1$, $n = 3$).

2.2 Learning identification [2]

In the learning identification, the coefficient vector is modified as
follows.

1° Setting of initial value: $v_0$ = arbitrary value.

2° Iteration:
2.1° $e_j = y_j - \langle v_j, \mathbf{x}_j \rangle$
2.2° $\Delta v_j = e_j \mathbf{x}_j / \|\mathbf{x}_j\|^2$
2.3° $v_{j+1} = v_j + \mu \Delta v_j$

The constant $\mu$ is called the relaxation constant. As is shown in Fig. 2,
when $\mu = 1$, $v_{j+1}$ is the foot of the perpendicular from $v_j$ to
$\Pi_j$; $\|v_{j+1} - w\| \ge \|v_j - w\|$ holds when $\mu \le 0$ or
$\mu \ge 2$. Consequently, $0 < \mu < 2$ must be the case for the coefficient
vector to converge to $w$. Then $\|v_{j+1} - w\| \le \|v_j - w\|$ and the
convergence is monotonic. When $\{x_j\}$ is multiplied by a constant,
$\Delta v_j$ and consequently the convergence speed do not change. This is
another desirable property of the learning identification.

2.3 LMS algorithm [1]

Step 2.3° of the learning identification is modified as

$$ v_{j+1} = v_j + \mu e_j \mathbf{x}_j , $$

which results in the LMS algorithm. The direction of coefficient
modification is the same as that of the learning identification, and
$v_{j+1}$ is a point on the perpendicular from $v_j$ to $\Pi_j$.

In the past, the LMS algorithm has been understood as an approximation to
the steepest descent method, but the geometrical viewpoint is better suited
to understanding its behavior. In the LMS algorithm, the monotonic
convergence of the coefficient vector is not ensured, and the convergence
speed depends on the amplitude of $\{x_j\}$. From this point of view, the
learning identification is more desirable than the LMS algorithm.
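
The two update rules of Sects. 2.2 and 2.3 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the data vectors and the LMS step size are assumptions chosen for the example, and `x` stands for the input vector at time j.

```python
def dot(a, b):
    """Inner product <a, b> of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def learning_identification_step(v, x, y, mu=1.0):
    """One update v + mu * e * x / ||x||^2 with e = y - <v, x>."""
    e = y - dot(v, x)
    norm_sq = dot(x, x)
    if norm_sq == 0.0:  # an all-zero input vector carries no information
        return list(v)
    return [vi + mu * e * xi / norm_sq for vi, xi in zip(v, x)]


def lms_step(v, x, y, mu=0.1):
    """One LMS update v + mu * e * x (no normalization by ||x||^2)."""
    e = y - dot(v, x)
    return [vi + mu * e * xi for vi, xi in zip(v, x)]


# Hypothetical data: w plays the role of the unknown system.
w = [1.0, -2.0, 0.5]
x = [2.0, 1.0, -1.0]
y = dot(w, x)
v0 = [0.0, 0.0, 0.0]

v_li = learning_identification_step(v0, x, y, mu=1.0)
v_lms = lms_step(v0, x, y)

# The same steps after scaling the input (and hence the output) by 3.
x3 = [3 * xi for xi in x]
v_li_scaled = learning_identification_step(v0, x3, 3 * y, mu=1.0)
v_lms_scaled = lms_step(v0, x3, 3 * y)
```

With `mu = 1` the learning identification step lands on the hyperplane of the current sample, and scaling the input leaves its result unchanged, whereas the LMS update for the same data changes with the input amplitude.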

2.4 Problems in learning identification and LMS algorithm

For simplicity, the learning identification with $\mu = 1$ is considered. As
is seen from Fig. 2, the convergence speed of the coefficient vector depends
greatly on the angle between $\Pi_j$ and $\Pi_{j-1}$. In other words, when
the angle between $\Pi_j$ and $\Pi_{j-1}$ approaches 0 or $\pi$,

$$ \|v_{j+1} - w\| \approx \|v_j - w\| $$

and the convergence speed is decreased. Let the angle between $\Pi_j$ and
$\Pi_{j-1}$ be $\theta$, which is also the angle between $\mathbf{x}_j$ and
$\mathbf{x}_{j-1}$:

$$ \cos\theta = \frac{\langle \mathbf{x}_j, \mathbf{x}_{j-1} \rangle}{\|\mathbf{x}_j\| \cdot \|\mathbf{x}_{j-1}\|} . $$

The right-hand side of the above equation is nothing but the first-order
sample autocorrelation function of the signal $\{x_j\}$. Consequently, the
convergence speed decreases as the first-order autocorrelation function of
the signal approaches 1 in absolute value. The situation is the same for the
case of $\mu \ne 1$ and in the LMS algorithm.

This phenomenon arises because the direction of coefficient modification is
restricted to that of $\mathbf{x}_j$. To improve the situation, the direction
of the coefficient modification should be reconsidered.
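
The dependence on the angle can be checked numerically in the simplest setting. The sketch below assumes n = 2, mu = 1, a hypothetical true vector `w`, and two unit input vectors separated by an angle `theta`; under these assumptions one learning-identification step shrinks the error by the factor |cos(theta)|, so the error hardly decreases when the two input vectors are nearly parallel.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def project_onto_hyperplane(v, x, y):
    """Foot of the perpendicular from v to {u : <u, x> = y} (a mu = 1 step)."""
    e = y - dot(v, x)
    return [vi + e * xi / dot(x, x) for vi, xi in zip(v, x)]


def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def error_ratio(theta, w=(1.0, 2.0)):
    """||v_{j+1} - w|| / ||v_j - w|| for inputs separated by theta (n = 2)."""
    x_prev = [1.0, 0.0]
    x_cur = [math.cos(theta), math.sin(theta)]
    # v_j already lies on the previous hyperplane, as it does when mu = 1.
    v = project_onto_hyperplane([0.0, 0.0], x_prev, dot(w, x_prev))
    v_next = project_onto_hyperplane(v, x_cur, dot(w, x_cur))
    return distance(v_next, w) / distance(v, w)


for theta in (math.pi / 2, math.pi / 3, 0.1):
    print(round(theta, 3), round(error_ratio(theta), 6))
```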

3. New Adaptive Algorithm and Its Convergence Speed

3.1 Construction of algorithm 

As is seen in Fig. 2, to keep the convergence speed constant, independently
of the angle between $\mathbf{x}_j$ and $\mathbf{x}_{j-1}$, the perpendicular
should be drawn from $v_j$ to $\Pi_j \cap \Pi_{j-1}$, not to $\Pi_j$. Letting
the foot of the perpendicular be $v'_{j+1}$ and introducing the relaxation
constant $\mu$ in the same way as in the learning identification, the
following algorithm can be constructed (Fig. 3):

1° Setting of initial value: $v_0$ = arbitrary value.

2° Iteration: $v_{j+1} = v_j + \mu (v'_{j+1} - v_j)$.

When $\mu = 1$, the iteration of this algorithm can be written as steps 2.1°
to 2.5° (the five step equations are not legible in this copy; the surviving
fragments involve $\|\mathbf{x}_{j-1}\|^2$, $\mathbf{x}_{j-1}$ and $e_j$).

The properties of this algorithm and the computation for $\mu \ne 1$ are
discussed as the general theory in the next section.

Fig. 3. Geometrical illustration of the new algorithm ($\mu = 1$, $n = 3$).
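
A step of the algorithm of Sect. 3.1 can be sketched as follows. This is an illustration under stated assumptions, not the paper's own step-by-step formulation: the foot of the perpendicular onto the intersection of the two hyperplanes is obtained here by solving the 2 x 2 Gram system directly, the two input vectors are assumed to be linearly independent, and all names and data are hypothetical.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def project_onto_intersection(v, x0, y0, x1, y1):
    """Foot of the perpendicular from v to {u : <u, x0> = y0, <u, x1> = y1}."""
    r0 = y0 - dot(v, x0)
    r1 = y1 - dot(v, x1)
    g00, g01, g11 = dot(x0, x0), dot(x0, x1), dot(x1, x1)
    det = g00 * g11 - g01 * g01
    if det == 0.0:
        raise ValueError("x0 and x1 are linearly dependent")
    # Solve the Gram system [[g00, g01], [g01, g11]] c = r by Cramer's rule.
    c0 = (g11 * r0 - g01 * r1) / det
    c1 = (g00 * r1 - g01 * r0) / det
    return [vi + c0 * a + c1 * b for vi, a, b in zip(v, x0, x1)]


def new_algorithm_step(v, x0, y0, x1, y1, mu=1.0):
    """Relaxed update: move from v toward the foot by the fraction mu."""
    foot = project_onto_intersection(v, x0, y0, x1, y1)
    return [vi + mu * (fi - vi) for vi, fi in zip(v, foot)]


# Hypothetical data: current and previous input vectors and their outputs.
w = [1.0, -2.0, 0.5]
x_cur, x_prev = [1.0, 0.5, 0.2], [0.5, 0.2, -0.3]
y_cur, y_prev = dot(w, x_cur), dot(w, x_prev)
v = [0.0, 0.0, 0.0]

v_new = new_algorithm_step(v, x_cur, y_cur, x_prev, y_prev, mu=1.0)

# Single-hyperplane (learning identification) step for comparison.
e = y_cur - dot(v, x_cur)
v_li = [vi + e * xi / dot(x_cur, x_cur) for vi, xi in zip(v, x_cur)]
```

With `mu = 1`, `v_new` satisfies both output equations at once, and its distance to `w` is no larger than that of `v_li`.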



3.2 Convergence speed 

To compare the convergence speeds of the learning identification and the
method proposed in this paper, a numerical experiment was performed on a
computer. As is shown in Fig. 4, a colored noise, obtained by passing
normally distributed random numbers through a first-order recursive filter,
is used as the signal $\{x_j\}$. The first-order autocorrelation of this
signal is the filter coefficient $\alpha$ itself, and the convergence speed
was examined by varying the value of $\alpha$. Nearly the same coefficient
vector is used for the system to be identified as is used in the numerical
experiment in [2].

In the experiment, the following number of steps $j(\varepsilon)$ is
determined in place of the convergence speed:

$$ j(\varepsilon) = \min\{\, j : \|v_j - w\| / \|w\| \le \varepsilon \,\}, \qquad v_0 = 0 . $$

Figures 5(a) and (b) show the relation between $\varepsilon$ and
$j(\varepsilon)$ for various values of $\alpha$ for the learning
identification and the new algorithm, both with $\mu = 1$. The order of the
filter is set as 16. By this comparison, the following observations are made.

(i) When $\alpha = 0$, there is no great difference, although the new
algorithm gives a slightly faster convergence. As is seen from Figs. 2 and 3,
if $\Pi_j$ and $\Pi_{j-1}$ are always orthogonal, the learning identification
and the new algorithm are the same. When $\alpha = 0$, the situation is close
to this.

(ii) When $\alpha$ approaches 1, $j(\varepsilon)$ in the learning
identification rapidly increases, while it does not change much in the new
algorithm. This is anticipated from the geometrical interpretation of the
algorithm. When $\alpha = 0.99$, the convergence speed of the new algorithm
is more than 10 times that of the learning identification.

Fig. 4. A colored Gaussian noise generated by a first-order recursive filter.
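
The test signal and the convergence measure of Sect. 3.2 can be sketched as below. The exact filter of Fig. 4 is not recoverable from this copy, so the recursion x_j = alpha * x_{j-1} + n_j with unit-variance Gaussian n_j is an assumption, as are the function names and the seed.

```python
import random


def colored_noise(alpha, length, seed=0):
    """Gaussian white noise passed through x_j = alpha * x_{j-1} + n_j."""
    rng = random.Random(seed)
    samples, prev = [], 0.0
    for _ in range(length):
        prev = alpha * prev + rng.gauss(0.0, 1.0)
        samples.append(prev)
    return samples


def lag1_autocorrelation(x):
    """First-order sample autocorrelation, normalized by the signal energy."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x)
    return num / den


def steps_to_converge(relative_errors, eps):
    """Smallest j with relative_errors[j] <= eps, or None if never reached.

    relative_errors[j] is meant to hold ||v_j - w|| / ||w||.
    """
    for j, err in enumerate(relative_errors):
        if err <= eps:
            return j
    return None


x = colored_noise(alpha=0.9, length=20000)
print(round(lag1_autocorrelation(x), 2))
```

For a long record the estimated first-order autocorrelation comes out close to `alpha`, which is the quantity that governs the angle between successive input vectors in Sect. 2.4.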



4. Extension of Algorithm [4, 5, 6] 
4.1 APA (affine projection algorithm) 

$\Pi_j$ or $\Pi_j \cap \Pi_{j-1}$ used so far does not necessarily contain
the origin of $R^n$. Consequently, it is not necessarily a subspace of $R^n$
as a vector space, but an affine subspace. Let $\Pi$ be an affine subspace of
$R^n$. The mapping which is the orthogonal projection of $R^n$ onto $\Pi$ is
written as $P_\Pi$. Using this notation, the learning identification for
$\mu = 1$ is written as the coefficient modification algorithm by

$$ v_{j+1} = P_{\Pi_j} v_j $$

and the proposed algorithm is that by

$$ v_{j+1} = P_{\Pi_j \cap \Pi_{j-1}} v_j . $$

From such a viewpoint, these algorithms can easily be extended. An algorithm
that performs the modification by

$$ v_{j+1} = P_{\Pi_j \cap \Pi_{j-1} \cap \cdots \cap \Pi_{j-(p-1)}} v_j \tag{1} $$

is considered. The vector $v_{j+1}$ in Eq. (1) is the solution of the system
of equations with $v$ as unknowns,

$$ \langle v, \mathbf{x}_j \rangle = y_j, \quad \langle v, \mathbf{x}_{j-1} \rangle = y_{j-1}, \quad \ldots, \quad \langle v, \mathbf{x}_{j-(p-1)} \rangle = y_{j-(p-1)} , \tag{2} $$

which minimizes $\|v - v_j\|$. Letting the coefficient matrix of the
left-hand side of Eq. (2) be

$$ X_j = (\mathbf{x}_j, \mathbf{x}_{j-1}, \ldots, \mathbf{x}_{j-(p-1)})' $$

and the constant in the right-hand side be

$$ \mathbf{y}_j = (y_j, y_{j-1}, \ldots, y_{j-(p-1)})' , $$

and letting $X_j^{+}$ be the Moore-Penrose generalized inverse of $X_j$, it
can be written as [7]

$$ v_{j+1} = X_j^{+} \mathbf{y}_j + (I - X_j^{+} X_j) v_j , \tag{3} $$

where $I$ is the unit matrix.

Fig. 5. Comparison of the convergence time between the learning method and
the new algorithm: (a) learning method; (b) new algorithm.

Based on the above equation and introducing the relaxation coefficient
$\mu$, the following adaptive algorithm is considered:

1° Setting of initial value: $v_0$ = arbitrary value.

2° Iteration:
2.1° $\Delta v_j = X_j^{+} (\mathbf{y}_j - X_j v_j)$
2.2° $v_{j+1} = v_j + \mu \Delta v_j$

In this paper, this algorithm is called APA (affine projection algorithm)
and $p$ is called its order. According to this definition, the learning
identification is the first-order APA, and the algorithm in the preceding
section is the second-order APA.
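
The iteration above can be sketched for a general order p as follows. The sketch assumes that the p input vectors are linearly independent, so that the generalized inverse applied to the residual can be computed by solving the p x p Gram system; the small Gaussian-elimination helper, the function names and the data are all assumptions made for the illustration.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def solve(A, b):
    """Solve A c = b by Gaussian elimination with partial pivoting.

    A is assumed to be square and nonsingular.
    """
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    c = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * c[k] for k in range(i + 1, n))
        c[i] = s / M[i][i]
    return c


def apa_step(v, X, y, mu=1.0):
    """One APA update; X holds the p most recent input vectors as rows."""
    residual = [yi - dot(v, xi) for xi, yi in zip(X, y)]
    gram = [[dot(a, b) for b in X] for a in X]
    c = solve(gram, residual)  # pseudo-inverse applied via the Gram system
    dv = [sum(ci * xi[k] for ci, xi in zip(c, X)) for k in range(len(v))]
    return [vi + mu * di for vi, di in zip(v, dv)]


# Hypothetical data for a third-order step on a four-tap filter.
w = [1.0, -2.0, 0.5, 3.0]
X = [[1.0, 0.5, 0.2, -0.1], [0.5, 0.2, -0.1, 0.4], [0.2, -0.1, 0.4, 0.7]]
y = [dot(w, xi) for xi in X]
v1 = apa_step([0.0] * 4, X, y, mu=1.0)
```

With `mu = 1` the updated vector reproduces all p desired outputs, and with a single row in `X` the step reduces to the learning identification.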

4.2 Fundamental properties of APA 

This section describes three funda- 
mental properties of APA. 

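Property 1. If $0 < \mu < 2$, $\|v_{j+1} - w\| \le \|v_j - w\|$. If
$\mu \le 0$ or $\mu \ge 2$, $\|v_{j+1} - w\| \ge \|v_j - w\|$.

Proof. Let $v'_{j+1} = v_j + \Delta v_j$. Then

$$ v'_{j+1} = P_{\Pi_j \cap \Pi_{j-1} \cap \cdots \cap \Pi_{j-(p-1)}} v_j . $$

Since $w \in \Pi_j \cap \Pi_{j-1} \cap \cdots \cap \Pi_{j-(p-1)}$,
$v'_{j+1} - w$ and $\Delta v_j$ are orthogonal to each other. Consequently,
by Pythagoras' theorem,

$$ \|v_j - w\|^2 = \|\Delta v_j\|^2 + \|v'_{j+1} - w\|^2 . \tag{4} $$

Consequently,

$$ \|v_{j+1} - w\|^2 = \|v'_{j+1} - w\|^2 + (1 - \mu)^2 \|\Delta v_j\|^2 = \|v_j - w\|^2 - \mu (2 - \mu) \|\Delta v_j\|^2 . $$

Thus, the result is obtained. (End of Proof.)

From the above property, it is seen that in order for the coefficient vector
in APA to converge to $w$, $0 < \mu < 2$ is necessary. It is seen also that
if $\mu$ is in this range, the coefficient vector never goes away from $w$,
i.e., the convergence is monotonic. It is not necessarily true that
$0 < \mu < 2$ is a sufficient condition for the convergence.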
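
Property 1 can be checked numerically. Combining the orthogonality used in the proof with the update rule gives the squared error after one step as the squared error before it minus mu * (2 - mu) times the squared length of the modification vector. The sketch below verifies this for a first-order step; the vectors and the set of relaxation constants tried are assumptions.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


# Hypothetical true vector, current estimate and input vector.
w = [1.0, -2.0, 0.5]
v = [0.3, 0.1, -0.4]
x = [2.0, 1.0, -1.0]

e = dot(w, x) - dot(v, x)              # noise-free output error
dv = [e * xi / dot(x, x) for xi in x]  # first-order modification vector


def error_after(mu):
    """Squared distance to w after one step with relaxation constant mu."""
    v_next = [vi + mu * di for vi, di in zip(v, dv)]
    return sq_dist(v_next, w)


def predicted(mu):
    """Squared error given by the Pythagoras argument."""
    return sq_dist(v, w) - mu * (2.0 - mu) * dot(dv, dv)


for mu in (-0.5, 0.5, 1.0, 1.5, 2.5):
    print(mu, error_after(mu) <= sq_dist(v, w))
```

For the values tried, the error decreases only for the relaxation constants strictly between 0 and 2.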

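Property 2. Let $0 < \mu < 2$ and $p > q$. Let the coefficient vector $v_j$
be modified once by the $p$th- and $q$th-order APA, and let the resulting
coefficient vectors be $v_{j+1}^{(p)}$ and $v_{j+1}^{(q)}$, respectively.
Then,

$$ \|v_{j+1}^{(p)} - w\| \le \|v_{j+1}^{(q)} - w\| . $$

Proof. Let the coefficient vectors obtained by modifying $v_j$ as above with
$\mu = 1$ be $v'^{(p)}_{j+1}$ and $v'^{(q)}_{j+1}$, respectively. Then, by
the same reasoning as in Eq. (4),

$$ \|v_{j+1}^{(p)} - w\|^2 = \|v'^{(p)}_{j+1} - w\|^2 + (1 - \mu)^2 \left( \|v_j - w\|^2 - \|v'^{(p)}_{j+1} - w\|^2 \right) . \tag{5} $$

A similar relation applies to $\|v_{j+1}^{(q)} - w\|^2$. On the other hand,
by the property of the orthogonal projection,

$$ P_{\Pi} = P_{\Pi} \circ P_{\Pi'} $$

holds in general for affine subspaces $\Pi \subseteq \Pi'$, where $\circ$
indicates the composition of mappings. Since $p > q$ is assumed,

$$ \Pi_j \cap \cdots \cap \Pi_{j-(p-1)} \subseteq \Pi_j \cap \cdots \cap \Pi_{j-(q-1)} . $$

Consequently,

$$ v'^{(p)}_{j+1} = P_{\Pi_j \cap \cdots \cap \Pi_{j-(p-1)}} v'^{(q)}_{j+1} . $$

It is seen from this that $v'^{(p)}_{j+1} - w$ and
$v'^{(p)}_{j+1} - v'^{(q)}_{j+1}$ are orthogonal, and

$$ \|v'^{(p)}_{j+1} - w\| \le \|v'^{(q)}_{j+1} - w\| . $$

Using this inequality, Eq. (5) and $0 < \mu < 2$, the result is obtained.
(End of Proof.)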

It is anticipated from Property 2 that 
the convergence speed may be increased by 
increasing the order. 

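Property 3. Let $\{\tilde{x}_j\}$ be the signal obtained by multiplying the
amplitude of the input signal $\{x_j\}$ by $a$ ($a \ne 0$), i.e.,
$\tilde{x}_j = a x_j$. Starting from the initial value $v_0$, let the
coefficient vector obtained by using the input signal $\{x_j\}$ at time $j$
be $v_j$, and that obtained by using the input signal $\{\tilde{x}_j\}$ be
$\tilde{v}_j$, respectively. Then,

$$ \tilde{v}_j = v_j \quad (j = 0, 1, 2, \ldots) . $$

Proof. Let $\tilde{X}_j$ and $\tilde{\mathbf{y}}_j$ be the matrix and the
vector corresponding to $X_j$ and $\mathbf{y}_j$ for the input signal
$\{\tilde{x}_j\}$, and let $\tilde{X}_j^{+}$ be the Moore-Penrose generalized
inverse matrix of $\tilde{X}_j$. Then, by the definition of APA, the
coefficient vector $\tilde{v}_j$ for the input signal $\{\tilde{x}_j\}$ is
successively determined as

$$ \tilde{v}_0 = v_0 , \tag{6} $$

$$ \Delta\tilde{v}_j = \tilde{X}_j^{+} (\tilde{\mathbf{y}}_j - \tilde{X}_j \tilde{v}_j) , \tag{7} $$

$$ \tilde{v}_{j+1} = \tilde{v}_j + \mu \Delta\tilde{v}_j . \tag{8} $$

Obviously, $\tilde{X}_j = a X_j$ and $\tilde{\mathbf{y}}_j = a \mathbf{y}_j$,
and [8]

$$ \tilde{X}_j^{+} = \frac{1}{a} X_j^{+} . $$

Consequently, from Eq. (7),

$$ \Delta\tilde{v}_j = X_j^{+} (\mathbf{y}_j - X_j \tilde{v}_j) . $$

Thus, if $\tilde{v}_j = v_j$, then $\Delta\tilde{v}_j = \Delta v_j$, and
$\tilde{v}_{j+1} = v_{j+1}$ follows from Eq. (8). From this and Eq. (6), the
result is obtained by mathematical induction.

From Property 3, it is seen that the behavior of the coefficient vector and
the convergence speed in APA do not change if the input signal is multiplied
by a nonzero constant. This implies that one does not need to consider the
adjustment of the amplitude of the input signal, which is a very desirable
property from the practical viewpoint. It was already described in Sect. 2
that the learning identification has Properties 1 and 3, but those
properties are shared in common by the whole APA.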
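
Property 3 can be illustrated with a second-order APA run on a signal and on the same signal scaled by a constant. The sketch is a hedged example: the signal samples, the filter length, the relaxation constant and the function names are assumptions, and successive input vectors are assumed to be linearly independent so that the 2 x 2 Gram system can be solved directly.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def apa2_step(v, x0, y0, x1, y1, mu):
    """Second-order APA update using the two most recent input vectors."""
    r0, r1 = y0 - dot(v, x0), y1 - dot(v, x1)
    g00, g01, g11 = dot(x0, x0), dot(x0, x1), dot(x1, x1)
    det = g00 * g11 - g01 * g01  # assumed nonzero (independent inputs)
    c0 = (g11 * r0 - g01 * r1) / det
    c1 = (g00 * r1 - g01 * r0) / det
    return [vi + mu * (c0 * a + c1 * b) for vi, a, b in zip(v, x0, x1)]


def run(signal, w, mu=0.7):
    """Identify w from a noise-free record, starting from the zero vector."""
    n = len(w)
    v = [0.0] * n
    for j in range(n, len(signal)):
        x0 = [signal[j - k] for k in range(n)]      # input vector at time j
        x1 = [signal[j - 1 - k] for k in range(n)]  # input vector at time j-1
        v = apa2_step(v, x0, dot(w, x0), x1, dot(w, x1), mu)
    return v


# Hypothetical input record and unknown system.
signal = [1.0, -0.5, 2.0, 0.3, -1.2, 0.8, 1.5, -0.7, 0.4, 2.2]
w = [0.5, -1.0, 0.25]

v_plain = run(signal, w)
v_scaled = run([5.0 * s for s in signal], w)
```

Up to rounding, `v_plain` and `v_scaled` coincide, in line with the statement that the amplitude of the input signal needs no adjustment.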

4.3 The case where external disturbance 
exists 

So far, it has been assumed that no signal is impressed on the system of
Fig. 1 other than the input signal $\{x_j\}$. In practical applications,
however, there are many cases where other signals exist which are not
negligible. For example, in the application of the adaptive filter to
two-input noise elimination as in Fig. 6 [1], the signal from the noise
source corresponds to $\{x_j\}$ in Fig. 1, which is required to identify the
system to be identified (i.e., the path from the noise source to the input
terminal), and the signal from the signal source is not one considered up to
this point. The signal from the signal source, however, is the signal to be
picked up, and if it is neglected, the whole problem becomes meaningless.

The situation in Fig. 6 can be modelled as in Fig. 7. In the following, this
system is used to consider the effect of the external disturbance on the
coefficient vector $v_j$ and the output signal $\{e_j\}$. In the model in
Fig. 7, the following three coefficient vectors are defined.

(1) Let $\{y_j\}$ be the desired output. Let the coefficient vector which is
obtained by successive modification by APA of the coefficient vector with the
initial value $v_0$ be $v_j$.

(2) Let $\{y_j^{(1)}\}$ be the desired output. Let the coefficient vector
which is obtained by successive modification by APA of the coefficient vector
with the initial value $v_0$ be $v_j^{(1)}$.

(3) Let $\{y_j^{(2)}\}$ be the desired output. Then let the coefficient
vector which is obtained by successive modification by APA of the coefficient
vector with the initial value 0 be $v_j^{(2)}$.

Fig. 6. Schematic diagram of a 2-input noise canceller using an adaptive
filter.

Fig. 7. System identification when a disturbance signal exists.



Then, as can easily be verified,

$$ v_j = v_j^{(1)} + v_j^{(2)} . \tag{9} $$

The definition for $v_j^{(1)}$ is nothing but that for the coefficient vector
without external disturbance, and $v_j^{(2)}$ is the term newly produced by
the external disturbance.
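
The decomposition of the coefficient vector can be verified numerically, because for a fixed input the update is affine in the coefficient vector and the desired output. The sketch below uses the first-order APA for brevity; the input vectors, the disturbance samples, the initial value and the relaxation constant are assumptions chosen for the example.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def step(v, x, y, mu):
    """First-order APA (learning identification) update."""
    e = y - dot(v, x)
    return [vi + mu * e * xi / dot(x, x) for vi, xi in zip(v, x)]


def run(v0, xs, ys, mu=0.5):
    """Apply the update successively over the whole record."""
    v = list(v0)
    for x, y in zip(xs, ys):
        v = step(v, x, y, mu)
    return v


# Hypothetical system, input vectors and external disturbance samples.
w = [1.0, -2.0]
xs = [[1.0, 0.5], [0.5, -1.0], [-1.0, 2.0], [2.0, 0.3]]
y1 = [dot(w, x) for x in xs]   # output of the system to be identified
y2 = [0.2, -0.1, 0.05, 0.3]    # external disturbance
v0 = [0.4, -0.3]

v_total = run(v0, xs, [a + b for a, b in zip(y1, y2)])  # disturbed case
v_clean = run(v0, xs, y1)              # no disturbance, initial value v0
v_dist = run([0.0, 0.0], xs, y2)       # disturbance only, initial value 0
```

Here `v_total` equals the component-wise sum of `v_clean` and `v_dist` up to rounding.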

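Using Eq. (9), the output signal $e_j$ can be decomposed as

$$ e_j = e_j^{(1)} + e_j^{(2)} + e_j^{(3)} , $$

where

$$ e_j^{(1)} = y_j^{(1)} - \langle v_j^{(1)}, \mathbf{x}_j \rangle , \qquad e_j^{(2)} = y_j^{(2)} , \qquad e_j^{(3)} = -\langle v_j^{(2)}, \mathbf{x}_j \rangle . $$

The signal $e_j^{(1)}$ is equal to the output signal without external
disturbance. The signal $e_j^{(2)}$ is the applied disturbance itself;
$e_j^{(3)}$ is the term produced by application of the external disturbance.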

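Consider a case where the output signal $\{e_j\}$ converges to 0 if there is
no external disturbance. Then, after a sufficient elapse of time,

$$ e_j \approx e_j^{(2)} + e_j^{(3)} . $$

Thus consider how $e_j^{(3)}$ changes with the change of amplitudes of the
input signal $\{x_j\}$ and the external disturbance $\{y_j^{(2)}\}$. From the
definition of $v_j^{(2)}$,

$$ v_{j+1}^{(2)} = (I - \mu X_j^{+} X_j) v_j^{(2)} + \mu X_j^{+} \mathbf{y}_j^{(2)} , \qquad v_0^{(2)} = 0 , $$

where $\mathbf{y}_j^{(2)} = (y_j^{(2)}, y_{j-1}^{(2)}, \ldots, y_{j-(p-1)}^{(2)})'$.
Using the notations

$$ u_j = \mu X_j^{+} \mathbf{y}_j^{(2)} , \qquad \Phi(k; j) = (I - \mu X_j^{+} X_j)(I - \mu X_{j-1}^{+} X_{j-1}) \cdots (I - \mu X_{k+1}^{+} X_{k+1}) \ (k < j), \quad \Phi(j; j) = I , $$

$v_{j+1}^{(2)}$ can be represented as

$$ v_{j+1}^{(2)} = \sum_{k=0}^{j} \Phi(k; j) u_k . \tag{10} $$

When $\{y_j^{(2)}\}$ is multiplied by $a$, $\Phi(k; j)$ does not change and
$u_j$ is multiplied by $a$. Consequently, by Eq. (10), $v_j^{(2)}$ is
multiplied by $a$ and $e_j^{(3)}$ is multiplied by $a$. When $\{x_j\}$ is
multiplied by $b$ ($b \ne 0$), $X_j$ is multiplied by $b$ and $X_j^{+}$ is
multiplied by $1/b$, as is described in the proof of Property 3.
Consequently, $\Phi(k; j)$ does not change and $u_j$ is multiplied by $1/b$.
By Eq. (10), $v_j^{(2)}$ is multiplied by $1/b$, but $\mathbf{x}_j$ is
multiplied by $b$ and $e_j^{(3)}$ does not change.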

Consider the case where $\mu$ is changed. When $\mu$ approaches 0, $u_j$
approaches 0 and $\Phi(k; j)$ approaches $I$. Consequently, by Eq. (10),
$v_j^{(2)}$ approaches 0 and $e_j^{(3)}$ approaches 0. In the application of
the adaptive filter to the noise elimination in the two-input case,
$\{e_j^{(2)}\}$ ($= \{y_j^{(2)}\}$) is the desired signal and
$\{e_j^{(3)}\}$ is the output noise. Consequently, the following property is
derived from the above discussion.

Property 4. When APA is used in the noise elimination in a two-input system,
the signal-to-noise ratio of the output is independent of the input
signal-to-noise ratio, and approaches infinity when the relaxation constant
approaches 0.

Table 1. Effects of the input signal $\{x_j\}$, the external disturbance
$\{y_j^{(2)}\}$ and $\mu$ on the output components.

Output          External disturbance    Input signal       Relaxation
component       multiplied by a         multiplied by b    coefficient -> 0

{e_j^(1)}       No change               Multiplied by b    Convergence speed -> 0
{e_j^(2)}       Multiplied by a         No change          No change
{e_j^(3)}       Multiplied by a         No change          -> 0
{e_j^(4)}       No change               Multiplied by b    No change
{e_j^(5)}       No change               Multiplied by b    -> 0

4.4 Adaptive filter of insufficient order

So far, only the case has been considered where the order of the system to
be identified is the same as that of the adaptive filter. In practice,
however, they usually do not coincide. No problem occurs when the order of
the adaptive filter is larger than that of the system to be identified.
Consider how the output signal $\{e_j\}$ changes when the order of the
adaptive filter is not sufficient.

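Assume that the system to be identified in Fig. 7 has an infinite order, and
let the coefficients be $w_1, w_2, \ldots$. The output of this system at time
$j$ is given by

$$ \sum_{k=1}^{\infty} w_k x_{j-k+1} , $$

which is decomposed into two components as

$$ y_j^{(1)} = \sum_{k=1}^{n} w_k x_{j-k+1} , \qquad y_j^{(3)} = \sum_{k=n+1}^{\infty} w_k x_{j-k+1} . $$

As in the previous section, let $\{y_j^{(3)}\}$ be the desired output. Let
the coefficient vector obtained by successive modification by APA of the
coefficient vector with the initial value 0 be $v_j^{(3)}$. Then $v_j$ is
decomposed into three components as

$$ v_j = v_j^{(1)} + v_j^{(2)} + v_j^{(3)} , \tag{11} $$

where $v_j$, $v_j^{(1)}$ and $v_j^{(2)}$ are the vectors defined in the
previous section. Using Eq. (11), $e_j$ is decomposed into five terms as
follows:

$$ e_j = e_j^{(1)} + e_j^{(2)} + e_j^{(3)} + e_j^{(4)} + e_j^{(5)} , $$

where $e_j^{(1)}$, $e_j^{(2)}$ and $e_j^{(3)}$ are the variables defined in
the previous section, and

$$ e_j^{(4)} = y_j^{(3)} , \qquad e_j^{(5)} = -\langle v_j^{(3)}, \mathbf{x}_j \rangle . $$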

Consider how those terms change when the amplitude of $\{x_j\}$ is changed.
When $\{x_j\}$ is multiplied by $b$ ($b \ne 0$), obviously $e_j^{(4)}$ is
multiplied by $b$. Letting

$$ u_j^{(3)} = \mu X_j^{+} \mathbf{y}_j^{(3)} , \qquad \mathbf{y}_j^{(3)} = (y_j^{(3)}, y_{j-1}^{(3)}, \ldots, y_{j-(p-1)}^{(3)})' , $$

we can write

$$ v_{j+1}^{(3)} = \sum_{k=0}^{j} \Phi(k; j) u_k^{(3)} \tag{12} $$

in the same way as for $v_j^{(2)}$ in the previous section. When $\{x_j\}$ is
multiplied by $b$ ($b \ne 0$), $X_j^{+}$ is multiplied by $1/b$ and
$y_j^{(3)}$ is multiplied by $b$. Consequently, $u_j^{(3)}$ does not change
and, by Eq. (12), $v_j^{(3)}$ does not change. Thus, $e_j^{(5)}$ is
multiplied by $b$. In other words, both $e_j^{(4)}$ and $e_j^{(5)}$ are terms
proportional to the amplitude of the input signal $\{x_j\}$.

Thus, when the order of the adaptive filter is not sufficient, a noise
component proportional to the amplitude of the input signal $\{x_j\}$ appears
in the output, and Property 4 does not apply. The effects of the input signal
$\{x_j\}$, the external disturbance $\{y_j^{(2)}\}$ and the change of the
value of $\mu$ on $\{e_j^{(1)}\}$ to $\{e_j^{(5)}\}$ are summarized in
Table 1.

5. Conclusions



The qualitative properties of the adaptive algorithm APA proposed in this
paper are clarified by this study. However, some quantitative properties
require further study. For the case where the convergence speed of the
adaptive filter degrades greatly depending on the properties of the input
signal, an approach by lattice filter has also been proposed [9]. A problem
left for further study is to compare the merits and demerits of the lattice
filter with those of APA.

A problem in APA is that the computational complexity of a modification
increases with the order. Taking the recent hardware progress into
consideration, the computational complexity will be overcome in the near
future. APA of second or higher order will become practical in the future.